Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 9380 |
| Missing cells | 15146 |
| Missing cells (%) | 12.4% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 3.6 MiB |
| Average record size in memory | 407.5 B |
Variable types
| NUM | 8 |
|---|---|
| CAT | 5 |
Tax Article has constant value "9A" | Constant |
Credit Name has a high cardinality: 57 distinct values | High cardinality |
Notes has 7193 (76.7%) missing values | Missing |
Number of Taxpayers has 1553 (16.6%) missing values | Missing |
Amount of Credit has 1553 (16.6%) missing values | Missing |
Percent of Credit has 1553 (16.6%) missing values | Missing |
Median Amount of Credit has 1741 (18.6%) missing values | Missing |
Mean Amount of Credit has 1553 (16.6%) missing values | Missing |
Median Amount of Credit is highly skewed (γ1 = 27.66914305) | Skewed |
Mean Amount of Credit is highly skewed (γ1 = 30.7762013) | Skewed |
Number of Taxpayers has 3222 (34.3%) zeros | Zeros |
Amount of Credit has 3222 (34.3%) zeros | Zeros |
Percent of Credit has 3225 (34.4%) zeros | Zeros |
Median Amount of Credit has 3222 (34.3%) zeros | Zeros |
Mean Amount of Credit has 3222 (34.3%) zeros | Zeros |
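As a consistency check, the per-variable missing counts listed in the warnings above should add up to the "Missing cells" figure in the dataset statistics. The following is a minimal Python sketch that uses only numbers quoted in this report; the variable names (`missing_by_variable` and so on) are illustrative and not part of the report.

```python
# Missing counts per variable, copied from the warnings in this report.
missing_by_variable = {
    "Notes": 7193,
    "Number of Taxpayers": 1553,
    "Amount of Credit": 1553,
    "Percent of Credit": 1553,
    "Median Amount of Credit": 1741,
    "Mean Amount of Credit": 1553,
}

n_variables = 13       # "Number of variables"
n_observations = 9380  # "Number of observations"

# Total missing cells and their share of all cells in the table.
missing_cells = sum(missing_by_variable.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 15146
print(f"{missing_pct:.1f}%")  # 12.4%
```

Both results agree with the dataset statistics, which suggests that no variable other than these six contains missing values.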
Reproduction
| Analysis started | 2020-12-13 00:05:00.368190 |
|---|---|
| Analysis finished | 2020-12-13 00:05:07.535358 |
| Duration | 7.17 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
Tax Year
Real number (ℝ≥0)
| Distinct | 16 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2011.035714 |
|---|---|
| Minimum | 2001 |
| Maximum | 2016 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 73.4 KiB |
Quantile statistics
| Minimum | 2001 |
|---|---|
| 5-th percentile | 2002 |
| Q1 | 2008 |
| median | 2012 |
| Q3 | 2014 |
| 95-th percentile | 2016 |
| Maximum | 2016 |
| Range | 15 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 4.203266868 |
|---|---|
| Coefficient of variation (CV) | 0.002090100558 |
| Kurtosis | -0.3098980137 |
| Mean | 2011.035714 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.8642538226 |
| Sum | 18863515 |
| Variance | 17.66745236 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=16)
| Value | Count | Frequency (%) | |
| 2014 | 1175 | 12.5% | |
| 2013 | 1075 | 11.5% | |
| 2012 | 1075 | 11.5% | |
| 2015 | 1040 | 11.1% | |
| 2016 | 1040 | 11.1% | |
| 2011 | 1000 | 10.7% | |
| 2006 | 325 | 3.5% | |
| 2005 | 325 | 3.5% | |
| 2007 | 315 | 3.4% | |
| 2010 | 315 | 3.4% | |
| 2009 | 310 | 3.3% | |
| 2008 | 310 | 3.3% | |
| 2004 | 280 | 3.0% | |
| 2003 | 280 | 3.0% | |
| 2002 | 275 | 2.9% | |
| 2001 | 240 | 2.6% |
| Value | Count | Frequency (%) | |
| 2001 | 240 | 2.6% | |
| 2002 | 275 | 2.9% | |
| 2003 | 280 | 3.0% | |
| 2004 | 280 | 3.0% | |
| 2005 | 325 | 3.5% | |
| 2006 | 325 | 3.5% | |
| 2007 | 315 | 3.4% | |
| 2008 | 310 | 3.3% | |
| 2009 | 310 | 3.3% | |
| 2010 | 315 | 3.4% |
| Value | Count | Frequency (%) | |
| 2016 | 1040 | 11.1% | |
| 2015 | 1040 | 11.1% | |
| 2014 | 1175 | 12.5% | |
| 2013 | 1075 | 11.5% | |
| 2012 | 1075 | 11.5% | |
| 2011 | 1000 | 10.7% | |
| 2010 | 315 | 3.4% | |
| 2009 | 310 | 3.3% | |
| 2008 | 310 | 3.3% | |
| 2007 | 315 | 3.4% |
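The Tax Year summary statistics can be recomputed from the value counts listed above. The sketch below assumes the reported variance is the sample variance (divisor n - 1, the pandas default); that assumption is not stated in the report, but the numbers agree with it.

```python
import math

# Value counts for Tax Year, copied from the frequency table above.
counts = {
    2001: 240, 2002: 275, 2003: 280, 2004: 280,
    2005: 325, 2006: 325, 2007: 315, 2008: 310,
    2009: 310, 2010: 315, 2011: 1000, 2012: 1075,
    2013: 1075, 2014: 1175, 2015: 1040, 2016: 1040,
}

n = sum(counts.values())
total = sum(year * c for year, c in counts.items())
mean = total / n

# Sample variance (assumed divisor n - 1), then the derived statistics.
variance = sum(c * (year - mean) ** 2 for year, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total)  # 9380 18863515
print(round(mean, 6), round(variance, 6), round(std, 6), round(cv, 9))
```

The small coefficient of variation is expected here: the spread is a few years, while the mean is a calendar year near 2011.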
Tax Article
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.4 KiB |
| 9A |
|---|
| Value | Count | Frequency (%) | |
| 9A | 9380 | 100.0% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Most occurring characters
| Value | Count | Frequency (%) | |
| 9 | 9380 | 50.0% | |
| A | 9380 | 50.0% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Decimal Number | 9380 | 50.0% | |
| Uppercase Letter | 9380 | 50.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 9 | 9380 | 100.0% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| A | 9380 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Common | 9380 | 50.0% | |
| Latin | 9380 | 50.0% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| 9 | 9380 | 100.0% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| A | 9380 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 18760 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| 9 | 9380 | 50.0% | |
| A | 9380 | 50.0% |
Credit Type
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.4 KiB |
| Credit Used | 2115 |
|---|---|
| Credit Earned | 2096 |
| Credit Claimed | 1841 |
| Credit Carried Forward | 1796 |
| Credit Refunded | 1532 |
| Value | Count | Frequency (%) | |
| Credit Used | 2115 | 22.5% | |
| Credit Earned | 2096 | 22.3% | |
| Credit Claimed | 1841 | 19.6% | |
| Credit Carried Forward | 1796 | 19.1% | |
| Credit Refunded | 1532 | 16.3% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 22 |
|---|---|
| Median length | 14 |
| Mean length | 14.79520256 |
| Min length | 11 |
Most occurring characters
| Value | Count | Frequency (%) | |
| d | 22088 | 15.9% | |
| e | 20292 | 14.6% | |
| r | 18660 | 13.4% | |
| C | 13017 | 9.4% | |
| i | 13017 | 9.4% | |
| (space) | 11176 | 8.1% | |
| t | 9380 | 6.8% | |
| a | 7529 | 5.4% | |
| n | 3628 | 2.6% | |
| U | 2115 | 1.5% | |
| s | 2115 | 1.5% | |
| E | 2096 | 1.5% | |
| l | 1841 | 1.3% | |
| m | 1841 | 1.3% | |
| F | 1796 | 1.3% | |
| o | 1796 | 1.3% | |
| w | 1796 | 1.3% | |
| R | 1532 | 1.1% | |
| f | 1532 | 1.1% | |
| u | 1532 | 1.1% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 107047 | 77.1% | |
| Uppercase Letter | 20556 | 14.8% | |
| Space Separator | 11176 | 8.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| C | 13017 | 63.3% | |
| U | 2115 | 10.3% | |
| E | 2096 | 10.2% | |
| F | 1796 | 8.7% | |
| R | 1532 | 7.5% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| d | 22088 | 20.6% | |
| e | 20292 | 19.0% | |
| r | 18660 | 17.4% | |
| i | 13017 | 12.2% | |
| t | 9380 | 8.8% | |
| a | 7529 | 7.0% | |
| n | 3628 | 3.4% | |
| s | 2115 | 2.0% | |
| l | 1841 | 1.7% | |
| m | 1841 | 1.7% | |
| o | 1796 | 1.7% | |
| w | 1796 | 1.7% | |
| f | 1532 | 1.4% | |
| u | 1532 | 1.4% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 11176 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 127603 | 91.9% | |
| Common | 11176 | 8.1% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| d | 22088 | 17.3% | |
| e | 20292 | 15.9% | |
| r | 18660 | 14.6% | |
| C | 13017 | 10.2% | |
| i | 13017 | 10.2% | |
| t | 9380 | 7.4% | |
| a | 7529 | 5.9% | |
| n | 3628 | 2.8% | |
| U | 2115 | 1.7% | |
| s | 2115 | 1.7% | |
| E | 2096 | 1.6% | |
| l | 1841 | 1.4% | |
| m | 1841 | 1.4% | |
| F | 1796 | 1.4% | |
| o | 1796 | 1.4% | |
| w | 1796 | 1.4% | |
| R | 1532 | 1.2% | |
| f | 1532 | 1.2% | |
| u | 1532 | 1.2% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 11176 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 138779 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| d | 22088 | 15.9% | |
| e | 20292 | 14.6% | |
| r | 18660 | 13.4% | |
| C | 13017 | 9.4% | |
| i | 13017 | 9.4% | |
| (space) | 11176 | 8.1% | |
| t | 9380 | 6.8% | |
| a | 7529 | 5.4% | |
| n | 3628 | 2.6% | |
| U | 2115 | 1.5% | |
| s | 2115 | 1.5% | |
| E | 2096 | 1.5% | |
| l | 1841 | 1.3% | |
| m | 1841 | 1.3% | |
| F | 1796 | 1.3% | |
| o | 1796 | 1.3% | |
| w | 1796 | 1.3% | |
| R | 1532 | 1.1% | |
| f | 1532 | 1.1% | |
| u | 1532 | 1.1% |
Credit Name
Categorical
| Distinct | 57 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.4 KiB |
| Farmers' School Tax Credit | 370 |
|---|---|
| EZ/QEZE Tax Credits - EZ Wage Tax Credit | 370 |
| EZ/QEZE Tax Credits - EZ Investment Tax Credit | 370 |
| Investment Tax Credit | 370 |
| EZ/QEZE Tax Credits - EZ Capital Tax Credit | 340 |
| Other values (52) | 7560 |
| Value | Count | Frequency (%) | |
| Farmers' School Tax Credit | 370 | 3.9% | |
| EZ/QEZE Tax Credits - EZ Wage Tax Credit | 370 | 3.9% | |
| EZ/QEZE Tax Credits - EZ Investment Tax Credit | 370 | 3.9% | |
| Investment Tax Credit | 370 | 3.9% | |
| EZ/QEZE Tax Credits - EZ Capital Tax Credit | 340 | 3.6% | |
| Investment Tax Credit for the Financial Services Industry | 340 | 3.6% | |
| Special Additional Mortgage Recording Tax Credit | 340 | 3.6% | |
| Long-Term Care Insurance Credit | 315 | 3.4% | |
| EZ/QEZE Tax Credits - ZEA Wage Credit | 310 | 3.3% | |
| QETC Employment Credit | 305 | 3.3% | |
| EZ/QEZE Tax Credits - QEZE Credit for Real Property Taxes | 295 | 3.1% | |
| EZ/QEZE Tax Credits - QEZE Credit for Real Property Taxes For Corporate Partners | 260 | 2.8% | |
| EZ/QEZE Tax Credits - QEZE Tax Reduction Credit | 240 | 2.6% | |
| EZ/QEZE Tax Credits - QEZE Tax Reduction Credit For Corporate Partners | 230 | 2.5% | |
| Employment of Persons with Disabilities Tax Credit | 200 | 2.1% | |
| QETC Facilities, Operations, and Training Credit | 160 | 1.7% | |
| Conservation Easement Tax Credit | 140 | 1.5% | |
| Clean Heating Fuel Credit | 140 | 1.5% | |
| Fuel Cell Electric Generating Equipment Credit | 140 | 1.5% | |
| Credit for Purchase of an Automated External Defibrillator | 140 | 1.5% | |
| Brownfield Tax Credits - Environmental Remediation Insurance Tax Credit | 140 | 1.5% | |
| Biofuel Production Credit | 140 | 1.5% | |
| QETC Capital Tax Credit | 140 | 1.5% | |
| Brownfield Tax Credits - Remediation Real Property Tax Credit | 140 | 1.5% | |
| Empire State Film Production Credit | 140 | 1.5% | |
| Other values (32) | 3305 | 35.2% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 89 |
|---|---|
| Median length | 42 |
| Mean length | 44.30383795 |
| Min length | 21 |
Most occurring characters
| Value | Count | Frequency (%) | |
| (space) | 52880 | 12.7% | |
| e | 38345 | 9.2% | |
| r | 31285 | 7.5% | |
| i | 29970 | 7.2% | |
| t | 29500 | 7.1% | |
| a | 23230 | 5.6% | |
| d | 17440 | 4.2% | |
| o | 16345 | 3.9% | |
| n | 16340 | 3.9% | |
| C | 14985 | 3.6% | |
| E | 14410 | 3.5% | |
| s | 12665 | 3.0% | |
| T | 10925 | 2.6% | |
| x | 9605 | 2.3% | |
| l | 9395 | 2.3% | |
| Z | 7665 | 1.8% | |
| m | 7475 | 1.8% | |
| c | 7345 | 1.8% | |
| u | 4850 | 1.2% | |
| p | 4320 | 1.0% | |
| g | 4220 | 1.0% | |
| Q | 4185 | 1.0% | |
| - | 4060 | 1.0% | |
| f | 3890 | 0.9% | |
| P | 3875 | 0.9% | |
| Other values (38) | 36365 | 8.8% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 277820 | 66.9% | |
| Uppercase Letter | 74320 | 17.9% | |
| Space Separator | 52880 | 12.7% | |
| Other Punctuation | 4305 | 1.0% | |
| Dash Punctuation | 4060 | 1.0% | |
| Decimal Number | 2120 | 0.5% | |
| Control | 65 | < 0.1% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| C | 14985 | 20.2% | |
| E | 14410 | 19.4% | |
| T | 10925 | 14.7% | |
| Z | 7665 | 10.3% | |
| Q | 4185 | 5.6% | |
| P | 3875 | 5.2% | |
| R | 2850 | 3.8% | |
| F | 2430 | 3.3% | |
| I | 2405 | 3.2% | |
| S | 2380 | 3.2% | |
| A | 1550 | 2.1% | |
| B | 1065 | 1.4% | |
| M | 860 | 1.2% | |
| W | 785 | 1.1% | |
| D | 760 | 1.0% | |
| L | 595 | 0.8% | |
| O | 480 | 0.6% | |
| H | 460 | 0.6% | |
| J | 370 | 0.5% | |
| Y | 360 | 0.5% | |
| G | 280 | 0.4% | |
| V | 270 | 0.4% | |
| N | 245 | 0.3% | |
| U | 130 | 0.2% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| e | 38345 | 13.8% | |
| r | 31285 | 11.3% | |
| i | 29970 | 10.8% | |
| t | 29500 | 10.6% | |
| a | 23230 | 8.4% | |
| d | 17440 | 6.3% | |
| o | 16345 | 5.9% | |
| n | 16340 | 5.9% | |
| s | 12665 | 4.6% | |
| x | 9605 | 3.5% | |
| l | 9395 | 3.4% | |
| m | 7475 | 2.7% | |
| c | 7345 | 2.6% | |
| u | 4850 | 1.7% | |
| p | 4320 | 1.6% | |
| g | 4220 | 1.5% | |
| f | 3890 | 1.4% | |
| v | 3355 | 1.2% | |
| y | 2435 | 0.9% | |
| h | 2245 | 0.8% | |
| b | 1895 | 0.7% | |
| w | 1375 | 0.5% | |
| k | 155 | 0.1% | |
| q | 140 | 0.1% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 52880 | 100.0% |
Most frequent Dash Punctuation characters
| Value | Count | Frequency (%) | |
| - | 4060 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 1 | 360 | 17.0% | |
| 6 | 280 | 13.2% | |
| 2 | 280 | 13.2% | |
| 3 | 280 | 13.2% | |
| 0 | 280 | 13.2% | |
| 8 | 280 | 13.2% | |
| 7 | 180 | 8.5% | |
| 5 | 180 | 8.5% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| / | 3475 | 80.7% | |
| ' | 370 | 8.6% | |
| , | 320 | 7.4% | |
| & | 140 | 3.3% |
Most frequent Control characters
| Value | Count | Frequency (%) | |
| (control character) | 65 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 352140 | 84.7% | |
| Common | 63430 | 15.3% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| e | 38345 | 10.9% | |
| r | 31285 | 8.9% | |
| i | 29970 | 8.5% | |
| t | 29500 | 8.4% | |
| a | 23230 | 6.6% | |
| d | 17440 | 5.0% | |
| o | 16345 | 4.6% | |
| n | 16340 | 4.6% | |
| C | 14985 | 4.3% | |
| E | 14410 | 4.1% | |
| s | 12665 | 3.6% | |
| T | 10925 | 3.1% | |
| x | 9605 | 2.7% | |
| l | 9395 | 2.7% | |
| Z | 7665 | 2.2% | |
| m | 7475 | 2.1% | |
| c | 7345 | 2.1% | |
| u | 4850 | 1.4% | |
| p | 4320 | 1.2% | |
| g | 4220 | 1.2% | |
| Q | 4185 | 1.2% | |
| f | 3890 | 1.1% | |
| P | 3875 | 1.1% | |
| v | 3355 | 1.0% | |
| R | 2850 | 0.8% | |
| Other values (23) | 23670 | 6.7% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 52880 | 83.4% | |
| - | 4060 | 6.4% | |
| / | 3475 | 5.5% | |
| ' | 370 | 0.6% | |
| 1 | 360 | 0.6% | |
| , | 320 | 0.5% | |
| 6 | 280 | 0.4% | |
| 2 | 280 | 0.4% | |
| 3 | 280 | 0.4% | |
| 0 | 280 | 0.4% | |
| 8 | 280 | 0.4% | |
| 7 | 180 | 0.3% | |
| 5 | 180 | 0.3% | |
| & | 140 | 0.2% | |
| (control character) | 65 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 415505 | > 99.9% | |
| None | 65 | < 0.1% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| (space) | 52880 | 12.7% | |
| e | 38345 | 9.2% | |
| r | 31285 | 7.5% | |
| i | 29970 | 7.2% | |
| t | 29500 | 7.1% | |
| a | 23230 | 5.6% | |
| d | 17440 | 4.2% | |
| o | 16345 | 3.9% | |
| n | 16340 | 3.9% | |
| C | 14985 | 3.6% | |
| E | 14410 | 3.5% | |
| s | 12665 | 3.0% | |
| T | 10925 | 2.6% | |
| x | 9605 | 2.3% | |
| l | 9395 | 2.3% | |
| Z | 7665 | 1.8% | |
| m | 7475 | 1.8% | |
| c | 7345 | 1.8% | |
| u | 4850 | 1.2% | |
| p | 4320 | 1.0% | |
| g | 4220 | 1.0% | |
| Q | 4185 | 1.0% | |
| - | 4060 | 1.0% | |
| f | 3890 | 0.9% | |
| P | 3875 | 0.9% | |
| Other values (37) | 36300 | 8.7% |
Most frequent None characters
| Value | Count | Frequency (%) | |
| (control character) | 65 | 100.0% |
Basis Type
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.4 KiB |
| Capital Base | 1980 |
|---|---|
| Total | 1980 |
| Entire Net Income | 1980 |
| Alternative Minimum Tax | 1010 |
| Fixed Dollar Minimum Tax | 1010 |
| Other values (2) | 1420 |
| Value | Count | Frequency (%) | |
| Capital Base | 1980 | 21.1% | |
| Total | 1980 | 21.1% | |
| Entire Net Income | 1980 | 21.1% | |
| Alternative Minimum Tax | 1010 | 10.8% | |
| Fixed Dollar Minimum Tax | 1010 | 10.8% | |
| Fixed Dollar Minimum | 970 | 10.3% | |
| Alternative Minimum Income | 450 | 4.8% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 26 |
|---|---|
| Median length | 17 |
| Mean length | 15.5533049 |
| Min length | 5 |
Most occurring characters
| Value | Count | Frequency (%) | |
| i | 14280 | 9.8% | |
| (space) | 13830 | 9.5% | |
| a | 13380 | 9.2% | |
| e | 13270 | 9.1% | |
| t | 10840 | 7.4% | |
| l | 9380 | 6.4% | |
| n | 9310 | 6.4% | |
| m | 9310 | 6.4% | |
| o | 6390 | 4.4% | |
| r | 5420 | 3.7% | |
| x | 4000 | 2.7% | |
| T | 4000 | 2.7% | |
| M | 3440 | 2.4% | |
| u | 3440 | 2.4% | |
| I | 2430 | 1.7% | |
| c | 2430 | 1.7% | |
| E | 1980 | 1.4% | |
| N | 1980 | 1.4% | |
| F | 1980 | 1.4% | |
| d | 1980 | 1.4% | |
| D | 1980 | 1.4% | |
| C | 1980 | 1.4% | |
| p | 1980 | 1.4% | |
| B | 1980 | 1.4% | |
| s | 1980 | 1.4% | |
| Other values (2) | 2920 | 2.0% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 108850 | 74.6% | |
| Uppercase Letter | 23210 | 15.9% | |
| Space Separator | 13830 | 9.5% |
Most frequent Uppercase Letter characters
| Value | Count | Frequency (%) | |
| T | 4000 | 17.2% | |
| M | 3440 | 14.8% | |
| I | 2430 | 10.5% | |
| E | 1980 | 8.5% | |
| N | 1980 | 8.5% | |
| F | 1980 | 8.5% | |
| D | 1980 | 8.5% | |
| C | 1980 | 8.5% | |
| B | 1980 | 8.5% | |
| A | 1460 | 6.3% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| i | 14280 | 13.1% | |
| a | 13380 | 12.3% | |
| e | 13270 | 12.2% | |
| t | 10840 | 10.0% | |
| l | 9380 | 8.6% | |
| n | 9310 | 8.6% | |
| m | 9310 | 8.6% | |
| o | 6390 | 5.9% | |
| r | 5420 | 5.0% | |
| x | 4000 | 3.7% | |
| u | 3440 | 3.2% | |
| c | 2430 | 2.2% | |
| d | 1980 | 1.8% | |
| p | 1980 | 1.8% | |
| s | 1980 | 1.8% | |
| v | 1460 | 1.3% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 13830 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 132060 | 90.5% | |
| Common | 13830 | 9.5% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| i | 14280 | 10.8% | |
| a | 13380 | 10.1% | |
| e | 13270 | 10.0% | |
| t | 10840 | 8.2% | |
| l | 9380 | 7.1% | |
| n | 9310 | 7.0% | |
| m | 9310 | 7.0% | |
| o | 6390 | 4.8% | |
| r | 5420 | 4.1% | |
| x | 4000 | 3.0% | |
| T | 4000 | 3.0% | |
| M | 3440 | 2.6% | |
| u | 3440 | 2.6% | |
| I | 2430 | 1.8% | |
| c | 2430 | 1.8% | |
| E | 1980 | 1.5% | |
| N | 1980 | 1.5% | |
| F | 1980 | 1.5% | |
| d | 1980 | 1.5% | |
| D | 1980 | 1.5% | |
| C | 1980 | 1.5% | |
| p | 1980 | 1.5% | |
| B | 1980 | 1.5% | |
| s | 1980 | 1.5% | |
| A | 1460 | 1.1% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| (space) | 13830 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 145890 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| i | 14280 | 9.8% | |
| (space) | 13830 | 9.5% | |
| a | 13380 | 9.2% | |
| e | 13270 | 9.1% | |
| t | 10840 | 7.4% | |
| l | 9380 | 6.4% | |
| n | 9310 | 6.4% | |
| m | 9310 | 6.4% | |
| o | 6390 | 4.4% | |
| r | 5420 | 3.7% | |
| x | 4000 | 2.7% | |
| T | 4000 | 2.7% | |
| M | 3440 | 2.4% | |
| u | 3440 | 2.4% | |
| I | 2430 | 1.7% | |
| c | 2430 | 1.7% | |
| E | 1980 | 1.4% | |
| N | 1980 | 1.4% | |
| F | 1980 | 1.4% | |
| d | 1980 | 1.4% | |
| D | 1980 | 1.4% | |
| C | 1980 | 1.4% | |
| p | 1980 | 1.4% | |
| B | 1980 | 1.4% | |
| s | 1980 | 1.4% | |
| Other values (2) | 2920 | 2.0% |
Notes
Categorical
| Distinct | 9 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 7193 |
| Missing (%) | 76.7% |
| Memory size | 73.4 KiB |
| d/ | 1638 |
|---|---|
| 1/ | 204 |
| 3/ | 105 |
| 4/ | 86 |
| 1/, d/ | 60 |
| Other values (4) | 94 |
| Value | Count | Frequency (%) | |
| d/ | 1638 | 17.5% | |
| 1/ | 204 | 2.2% | |
| 3/ | 105 | 1.1% | |
| 4/ | 86 | 0.9% | |
| 1/, d/ | 60 | 0.6% | |
| 2/ | 51 | 0.5% | |
| d/, 4/ | 29 | 0.3% | |
| d/, 3/ | 10 | 0.1% | |
| 2/, d/ | 4 | < 0.1% | |
| (Missing) | 7193 | 76.7% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 6 |
|---|---|
| Median length | 3 |
| Mean length | 2.810767591 |
| Min length | 2 |
Most occurring characters
| Value | Count | Frequency (%) | |
| n | 14386 | 54.6% | |
| a | 7193 | 27.3% | |
| / | 2290 | 8.7% | |
| d | 1741 | 6.6% | |
| 1 | 264 | 1.0% | |
| 4 | 115 | 0.4% | |
| 3 | 115 | 0.4% | |
| , | 103 | 0.4% | |
| (space) | 103 | 0.4% | |
| 2 | 55 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) | |
| Lowercase Letter | 23320 | 88.5% | |
| Other Punctuation | 2393 | 9.1% | |
| Decimal Number | 549 | 2.1% | |
| Space Separator | 103 | 0.4% |
Most frequent Lowercase Letter characters
| Value | Count | Frequency (%) | |
| n | 14386 | 61.7% | |
| a | 7193 | 30.8% | |
| d | 1741 | 7.5% |
Most frequent Other Punctuation characters
| Value | Count | Frequency (%) | |
| / | 2290 | 95.7% | |
| , | 103 | 4.3% |
Most frequent Space Separator characters
| Value | Count | Frequency (%) | |
| (space) | 103 | 100.0% |
Most frequent Decimal Number characters
| Value | Count | Frequency (%) | |
| 1 | 264 | 48.1% | |
| 4 | 115 | 20.9% | |
| 3 | 115 | 20.9% | |
| 2 | 55 | 10.0% |
Most occurring scripts
| Value | Count | Frequency (%) | |
| Latin | 23320 | 88.5% | |
| Common | 3045 | 11.5% |
Most frequent Latin characters
| Value | Count | Frequency (%) | |
| n | 14386 | 61.7% | |
| a | 7193 | 30.8% | |
| d | 1741 | 7.5% |
Most frequent Common characters
| Value | Count | Frequency (%) | |
| / | 2290 | 75.2% | |
| 1 | 264 | 8.7% | |
| 4 | 115 | 3.8% | |
| 3 | 115 | 3.8% | |
| , | 103 | 3.4% | |
| (space) | 103 | 3.4% | |
| 2 | 55 | 1.8% |
Most occurring blocks
| Value | Count | Frequency (%) | |
| ASCII | 26365 | 100.0% |
Most frequent ASCII characters
| Value | Count | Frequency (%) | |
| n | 14386 | 54.6% | |
| a | 7193 | 27.3% | |
| / | 2290 | 8.7% | |
| d | 1741 | 6.6% | |
| 1 | 264 | 1.0% | |
| 4 | 115 | 0.4% | |
| 3 | 115 | 0.4% | |
| , | 103 | 0.4% | |
| (space) | 103 | 0.4% | |
| 2 | 55 | 0.2% |
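The character tables for this variable (Notes, the one with 7193 missing values) are dominated by `n` and `a`, letters that appear in none of the listed note codes. One plausible explanation, offered here as an assumption rather than something the report states, is that each missing value was counted as the string "nan". The sketch below rebuilds the character counts from the value counts under that assumption.

```python
from collections import Counter

# Value counts for Notes, copied from the frequency table above.
value_counts = {
    "d/": 1638, "1/": 204, "3/": 105, "4/": 86, "1/, d/": 60,
    "2/": 51, "d/, 4/": 29, "d/, 3/": 10, "2/, d/": 4,
}
n_missing = 7193

# Assumption: every missing value was stringified as "nan" before counting.
value_counts_with_nan = dict(value_counts, nan=n_missing)

# Weight each character by how many rows carry the value it appears in.
char_counts = Counter()
for value, count in value_counts_with_nan.items():
    for ch in value:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
n_rows = sum(value_counts_with_nan.values())
mean_length = total_chars / n_rows

print(char_counts["n"], char_counts["a"], char_counts["/"], char_counts["d"])
# 14386 7193 2290 1741
print(total_chars, n_rows)  # 26365 9380
```

Under this assumption the counts match the tables exactly, and the mean length of about 2.81 is the total character count divided by all 9380 rows. The length and character statistics for Notes therefore describe the "nan" placeholder more than the note codes themselves.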
Number of Taxpayers
Real number (ℝ≥0)
| Distinct | 604 |
|---|---|
| Distinct (%) | 7.7% |
| Missing | 1553 |
| Missing (%) | 16.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 73.38124441 |
|---|---|
| Minimum | 0 |
| Maximum | 5991 |
| Zeros | 3222 |
| Zeros (%) | 34.3% |
| Memory size | 73.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 6 |
| Q3 | 32 |
| 95-th percentile | 318.7 |
| Maximum | 5991 |
| Range | 5991 |
| Interquartile range (IQR) | 32 |
Descriptive statistics
| Standard deviation | 294.5836481 |
|---|---|
| Coefficient of variation (CV) | 4.01442699 |
| Kurtosis | 146.5852792 |
| Mean | 73.38124441 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | 10.20713178 |
| Sum | 574355 |
| Variance | 86779.52573 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 3222 | 34.3% | |
| 3 | 281 | 3.0% | |
| 4 | 193 | 2.1% | |
| 5 | 188 | 2.0% | |
| 7 | 185 | 2.0% | |
| 6 | 162 | 1.7% | |
| 9 | 142 | 1.5% | |
| 8 | 139 | 1.5% | |
| 10 | 121 | 1.3% | |
| 11 | 106 | 1.1% | |
| 12 | 106 | 1.1% | |
| 13 | 105 | 1.1% | |
| 16 | 81 | 0.9% | |
| 20 | 73 | 0.8% | |
| 18 | 69 | 0.7% | |
| 14 | 68 | 0.7% | |
| 17 | 62 | 0.7% | |
| 15 | 59 | 0.6% | |
| 19 | 53 | 0.6% | |
| 28 | 51 | 0.5% | |
| 22 | 48 | 0.5% | |
| 21 | 43 | 0.5% | |
| 27 | 42 | 0.4% | |
| 29 | 40 | 0.4% | |
| 31 | 39 | 0.4% | |
| Other values (579) | 2149 | 22.9% | |
| (Missing) | 1553 | 16.6% |
| Value | Count | Frequency (%) | |
| 0 | 3222 | 34.3% | |
| 3 | 281 | 3.0% | |
| 4 | 193 | 2.1% | |
| 5 | 188 | 2.0% | |
| 6 | 162 | 1.7% | |
| 7 | 185 | 2.0% | |
| 8 | 139 | 1.5% | |
| 9 | 142 | 1.5% | |
| 10 | 121 | 1.3% | |
| 11 | 106 | 1.1% |
| Value | Count | Frequency (%) | |
| 5991 | 1 | < 0.1% | |
| 5849 | 1 | < 0.1% | |
| 5783 | 1 | < 0.1% | |
| 5654 | 1 | < 0.1% | |
| 5542 | 1 | < 0.1% | |
| 5305 | 1 | < 0.1% | |
| 5161 | 1 | < 0.1% | |
| 4934 | 1 | < 0.1% | |
| 3221 | 1 | < 0.1% | |
| 3091 | 1 | < 0.1% |
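The percentages and the mean for this variable (Number of Taxpayers, judging by the 1553 missing values and 3222 zeros quoted in the warnings) do not all share the same denominator. The sketch below checks which figures are taken over all rows and which over non-missing rows only. It uses numbers quoted in this report; the variable names are illustrative.

```python
n_rows = 9380        # number of observations
n_missing = 1553     # "Missing"
n_zeros = 3222       # "Zeros"
n_distinct = 604     # "Distinct"
value_sum = 574355   # "Sum"
std = 294.5836481    # "Standard deviation"

n_present = n_rows - n_missing  # rows with an actual value

# The mean and CV are computed over non-missing rows only.
mean = value_sum / n_present
cv = std / mean

# Missing and zero percentages are relative to all rows ...
missing_pct = 100 * n_missing / n_rows
zeros_pct = 100 * n_zeros / n_rows
# ... while the distinct percentage is relative to non-missing rows.
distinct_pct = 100 * n_distinct / n_present

print(n_present)  # 7827
print(f"{missing_pct:.1f} {zeros_pct:.1f} {distinct_pct:.1f}")  # 16.6 34.3 7.7
```

Keeping the denominators apart matters when comparing variables: the zero share among rows that actually have a value is 3222 / 7827, noticeably higher than the reported 34.3%.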
Amount of Credit
Real number (ℝ≥0)
| Distinct | 4140 |
|---|---|
| Distinct (%) | 52.9% |
| Missing | 1553 |
| Missing (%) | 16.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 28533942.18 |
|---|---|
| Minimum | 0 |
| Maximum | 2261711738 |
| Zeros | 3222 |
| Zeros (%) | 34.3% |
| Memory size | 73.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 48726 |
| Q3 | 3216924.5 |
| 95-th percentile | 93579494.7 |
| Maximum | 2261711738 |
| Range | 2261711738 |
| Interquartile range (IQR) | 3216924.5 |
Descriptive statistics
| Standard deviation | 147199992.3 |
|---|---|
| Coefficient of variation (CV) | 5.158768156 |
| Kurtosis | 83.1283964 |
| Mean | 28533942.18 |
| Median Absolute Deviation (MAD) | 48726 |
| Skewness | 8.428285329 |
| Sum | 2.233351654e+11 |
| Variance | 2.166783772e+16 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 3222 | 34.3% | |
| 8000 | 4 | < 0.1% | |
| 516183 | 4 | < 0.1% | |
| 464631 | 4 | < 0.1% | |
| 16000 | 4 | < 0.1% | |
| 190591 | 4 | < 0.1% | |
| 14081 | 4 | < 0.1% | |
| 1847764 | 4 | < 0.1% | |
| 93000 | 3 | < 0.1% | |
| 11274 | 3 | < 0.1% | |
| 26478 | 3 | < 0.1% | |
| 32105479 | 3 | < 0.1% | |
| 5407729 | 3 | < 0.1% | |
| 321609 | 3 | < 0.1% | |
| 3909878 | 3 | < 0.1% | |
| 1042797 | 3 | < 0.1% | |
| 3754885 | 3 | < 0.1% | |
| 57500 | 3 | < 0.1% | |
| 8927 | 3 | < 0.1% | |
| 7000 | 3 | < 0.1% | |
| 2000 | 3 | < 0.1% | |
| 6399 | 3 | < 0.1% | |
| 763412 | 3 | < 0.1% | |
| 3247179 | 3 | < 0.1% | |
| 6320 | 3 | < 0.1% | |
| Other values (4115) | 4526 | 48.3% | |
| (Missing) | 1553 | 16.6% |
| Value | Count | Frequency (%) | |
| 0 | 3222 | 34.3% | |
| 82 | 1 | < 0.1% | |
| 197 | 1 | < 0.1% | |
| 218 | 1 | < 0.1% | |
| 287 | 2 | < 0.1% | |
| 378 | 1 | < 0.1% | |
| 603 | 2 | < 0.1% | |
| 954 | 1 | < 0.1% | |
| 1024 | 1 | < 0.1% | |
| 1099 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 2261711738 | 1 | < 0.1% | |
| 2216301425 | 1 | < 0.1% | |
| 2184814005 | 1 | < 0.1% | |
| 2121155257 | 1 | < 0.1% | |
| 2012336023 | 1 | < 0.1% | |
| 1980868488 | 1 | < 0.1% | |
| 1924587434 | 1 | < 0.1% | |
| 1828353542 | 1 | < 0.1% | |
| 1784825507 | 1 | < 0.1% | |
| 1686467520 | 1 | < 0.1% |
Percent of Credit
Real number (ℝ≥0)
| Distinct | 2409 |
|---|---|
| Distinct (%) | 30.8% |
| Missing | 1553 |
| Missing (%) | 16.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 31.50383289 |
|---|---|
| Minimum | 0 |
| Maximum | 100 |
| Zeros | 3225 |
| Zeros (%) | 34.4% |
| Memory size | 73.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 6.29 |
| Q3 | 66.785 |
| 95-th percentile | 100 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 66.785 |
Descriptive statistics
| Standard deviation | 40.03537842 |
|---|---|
| Coefficient of variation (CV) | 1.270809763 |
| Kurtosis | -1.008851353 |
| Mean | 31.50383289 |
| Median Absolute Deviation (MAD) | 6.29 |
| Skewness | 0.843625508 |
| Sum | 246580.5 |
| Variance | 1602.831525 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 3225 | 34.4% | |
| 100 | 1432 | 15.3% | |
| 0.19 | 6 | 0.1% | |
| 5.94 | 6 | 0.1% | |
| 4.62 | 6 | 0.1% | |
| 1.7 | 6 | 0.1% | |
| 7.42 | 6 | 0.1% | |
| 0.12 | 5 | 0.1% | |
| 7.39 | 5 | 0.1% | |
| 0.14 | 5 | 0.1% | |
| 0.72 | 5 | 0.1% | |
| 0.27 | 5 | 0.1% | |
| 16.62 | 5 | 0.1% | |
| 5.6 | 4 | < 0.1% | |
| 3.13 | 4 | < 0.1% | |
| 25.47 | 4 | < 0.1% | |
| 33.34 | 4 | < 0.1% | |
| 33.2 | 4 | < 0.1% | |
| 4.57 | 4 | < 0.1% | |
| 99.84 | 4 | < 0.1% | |
| 1.1 | 4 | < 0.1% | |
| 0.08 | 4 | < 0.1% | |
| 0.47 | 4 | < 0.1% | |
| 0.06 | 4 | < 0.1% | |
| 10.79 | 4 | < 0.1% | |
| Other values (2384) | 3062 | 32.6% | |
| (Missing) | 1553 | 16.6% |
| Value | Count | Frequency (%) | |
| 0 | 3225 | 34.4% | |
| 0.01 | 3 | < 0.1% | |
| 0.02 | 3 | < 0.1% | |
| 0.03 | 3 | < 0.1% | |
| 0.04 | 2 | < 0.1% | |
| 0.05 | 3 | < 0.1% | |
| 0.06 | 4 | < 0.1% | |
| 0.07 | 3 | < 0.1% | |
| 0.08 | 4 | < 0.1% | |
| 0.09 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 100 | 1432 | 15.3% | |
| 99.98 | 1 | < 0.1% | |
| 99.87 | 1 | < 0.1% | |
| 99.84 | 4 | < 0.1% | |
| 99.81 | 1 | < 0.1% | |
| 99.68 | 1 | < 0.1% | |
| 99.65 | 2 | < 0.1% | |
| 99.63 | 1 | < 0.1% | |
| 99.52 | 1 | < 0.1% | |
| 99.47 | 2 | < 0.1% |
Median Amount of Credit
Real number (ℝ≥0)
| Distinct | 3264 |
|---|---|
| Distinct (%) | 42.7% |
| Missing | 1741 |
| Missing (%) | 18.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 61266.91465 |
|---|---|
| Minimum | 0 |
| Maximum | 21527410 |
| Zeros | 3222 |
| Zeros (%) | 34.3% |
| Memory size | 73.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1202 |
| Q3 | 14775 |
| 95-th percentile | 167795 |
| Maximum | 21527410 |
| Range | 21527410 |
| Interquartile range (IQR) | 14775 |
Descriptive statistics
| Standard deviation | 543659.5772 |
|---|---|
| Coefficient of variation (CV) | 8.873624211 |
| Kurtosis | 907.280516 |
| Mean | 61266.91465 |
| Median Absolute Deviation (MAD) | 1202 |
| Skewness | 27.66914305 |
| Sum | 468017961 |
| Variance | 2.955657359e+11 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 3222 | 34.3% | |
| 2100 | 66 | 0.7% | |
| 500 | 53 | 0.6% | |
| 1250 | 35 | 0.4% | |
| 1000 | 31 | 0.3% | |
| 2500 | 19 | 0.2% | |
| 10000 | 15 | 0.2% | |
| 750 | 13 | 0.1% | |
| 5000 | 11 | 0.1% | |
| 6000 | 9 | 0.1% | |
| 7000 | 8 | 0.1% | |
| 11000 | 7 | 0.1% | |
| 25 | 6 | 0.1% | |
| 15000 | 6 | 0.1% | |
| 1777 | 6 | 0.1% | |
| 1351 | 6 | 0.1% | |
| 3000 | 6 | 0.1% | |
| 1875 | 6 | 0.1% | |
| 1940 | 6 | 0.1% | |
| 8000 | 6 | 0.1% | |
| 2133 | 6 | 0.1% | |
| 50198 | 5 | 0.1% | |
| 16500 | 5 | 0.1% | |
| 162870 | 5 | 0.1% | |
| 2190 | 5 | 0.1% | |
| Other values (3239) | 4076 | 43.5% | |
| (Missing) | 1741 | 18.6% |
| Value | Count | Frequency (%) | |
| 0 | 3222 | 34.3% | |
| 20 | 1 | < 0.1% | |
| 21 | 1 | < 0.1% | |
| 22 | 1 | < 0.1% | |
| 25 | 6 | 0.1% | |
| 38 | 1 | < 0.1% | |
| 44 | 1 | < 0.1% | |
| 47 | 1 | < 0.1% | |
| 53 | 1 | < 0.1% | |
| 60 | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 21527410 | 1 | < 0.1% | |
| 18847173 | 1 | < 0.1% | |
| 18408476 | 1 | < 0.1% | |
| 16415907 | 1 | < 0.1% | |
| 15169033 | 1 | < 0.1% | |
| 13045162 | 1 | < 0.1% | |
| 5573045 | 1 | < 0.1% | |
| 4950000 | 1 | < 0.1% | |
| 4949728 | 2 | < 0.1% | |
| 4140196 | 1 | < 0.1% |
Mean Amount of Credit
Real number (ℝ≥0)
| Distinct | 4055 |
|---|---|
| Distinct (%) | 51.8% |
| Missing | 1553 |
| Missing (%) | 16.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 354169.2655 |
|---|---|
| Minimum | 0 |
| Maximum | 109656694 |
| Zeros | 3222 |
| Zeros (%) | 34.3% |
| Memory size | 73.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 3610 |
| Q3 | 101059.5 |
| 95-th percentile | 1381621.3 |
| Maximum | 109656694 |
| Range | 109656694 |
| Interquartile range (IQR) | 101059.5 |
Descriptive statistics
| Standard deviation | 2276549.145 |
|---|---|
| Coefficient of variation (CV) | 6.427856301 |
| Kurtosis | 1376.494345 |
| Mean | 354169.2655 |
| Median Absolute Deviation (MAD) | 3610 |
| Skewness | 30.7762013 |
| Sum | 2772082841 |
| Variance | 5.182676009e+12 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 3222 | 34.3% | |
| 800 | 5 | 0.1% | |
| 6159 | 5 | 0.1% | |
| 2333 | 4 | < 0.1% | |
| 263966 | 4 | < 0.1% | |
| 154877 | 4 | < 0.1% | |
| 2976 | 4 | < 0.1% | |
| 14661 | 4 | < 0.1% | |
| 2255 | 4 | < 0.1% | |
| 2133 | 4 | < 0.1% | |
| 172061 | 4 | < 0.1% | |
| 2100 | 4 | < 0.1% | |
| 2850 | 3 | < 0.1% | |
| 2469652 | 3 | < 0.1% | |
| 177359 | 3 | < 0.1% | |
| 67846 | 3 | < 0.1% | |
| 1429 | 3 | < 0.1% | |
| 977470 | 3 | < 0.1% | |
| 9643 | 3 | < 0.1% | |
| 998 | 3 | < 0.1% | |
| 1899 | 3 | < 0.1% | |
| 5473 | 3 | < 0.1% | |
| 3759 | 3 | < 0.1% | |
| 9082 | 3 | < 0.1% | |
| 2311 | 3 | < 0.1% | |
| Other values (4030) | 4520 | 48.2% | |
| (Missing) | 1553 | 16.6% |
| Value | Count | Frequency (%) | |
| 0 | 3222 | 34.3% | |
| 21 | 1 | < 0.1% | |
| 41 | 2 | < 0.1% | |
| 44 | 1 | < 0.1% | |
| 66 | 1 | < 0.1% | |
| 76 | 1 | < 0.1% | |
| 144 | 1 | < 0.1% | |
| 191 | 1 | < 0.1% | |
| 201 | 2 | < 0.1% | |
| 220 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 109656694 | 2 | < 0.1% | |
| 30474305 | 1 | < 0.1% | |
| 27771484 | 1 | < 0.1% | |
| 25001715 | 1 | < 0.1% | |
| 23874523 | 1 | < 0.1% | |
| 22657901 | 1 | < 0.1% | |
| 22455393 | 1 | < 0.1% | |
| 19700587 | 1 | < 0.1% | |
| 18983604 | 1 | < 0.1% | |
| 18656054 | 1 | < 0.1% |
Group Sort Order
Real number (ℝ≥0)
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 212.9211087 |
|---|---|
| Minimum | 1 |
| Maximum | 999 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 73.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 999 |
| Maximum | 999 |
| Range | 998 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 406.6380856 |
|---|---|
| Coefficient of variation (CV) | 1.90980635 |
| Kurtosis | 0.005542608567 |
| Mean | 212.9211087 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 1.416151608 |
| Sum | 1997200 |
| Variance | 165354.5326 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=5)
| Value | Count | Frequency (%) | |
| 999 | 1980 | 21.1% | |
| 3 | 1980 | 21.1% | |
| 2 | 1980 | 21.1% | |
| 1 | 1980 | 21.1% | |
| 5 | 1460 | 15.6% |
| Value | Count | Frequency (%) | |
| 1 | 1980 | 21.1% | |
| 2 | 1980 | 21.1% | |
| 3 | 1980 | 21.1% | |
| 5 | 1460 | 15.6% | |
| 999 | 1980 | 21.1% |
| Value | Count | Frequency (%) | |
| 999 | 1980 | 21.1% | |
| 5 | 1460 | 15.6% | |
| 3 | 1980 | 21.1% | |
| 2 | 1980 | 21.1% | |
| 1 | 1980 | 21.1% |
Credit Type Sort Order
Real number (ℝ≥0)
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.232729211 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 73.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 5 |
| median | 6 |
| Q3 | 7 |
| 95-th percentile | 8 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 2.474627665 |
|---|---|
| Coefficient of variation (CV) | 0.4729133814 |
| Kurtosis | -0.742648609 |
| Mean | 5.232729211 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.7846569656 |
| Sum | 49083 |
| Variance | 6.12378208 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=5)
| Value | Count | Frequency (%) | |
| 6 | 2115 | 22.5% | |
| 1 | 2096 | 22.3% | |
| 5 | 1841 | 19.6% | |
| 8 | 1796 | 19.1% | |
| 7 | 1532 | 16.3% |
| Value | Count | Frequency (%) | |
| 1 | 2096 | 22.3% | |
| 5 | 1841 | 19.6% | |
| 6 | 2115 | 22.5% | |
| 7 | 1532 | 16.3% | |
| 8 | 1796 | 19.1% |
| Value | Count | Frequency (%) | |
| 8 | 1796 | 19.1% | |
| 7 | 1532 | 16.3% | |
| 6 | 2115 | 22.5% | |
| 5 | 1841 | 19.6% | |
| 1 | 2096 | 22.3% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
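The calculation described above can be sketched in plain Python. This is an illustrative sketch only, not the code the report generator runs; the function name `pearson_r` and the sample lists are assumptions made for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors cancel between numerator and denominator, so plain sums
    of deviations are enough.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical sample data, not taken from the profiled dataset.
x = [1, 2, 4, 7]
y = [3, 1, 6, 5]

r_original = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * a - 3 for a in x], [0.5 * b + 7 for b in y])
```

Here `r_original` and `r_rescaled` agree up to floating-point error, which illustrates the invariance under changes in location and scale.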
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
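As a rough illustration of the rank-based calculation above, the following sketch ranks each variable and then applies the Pearson formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the sample data are assumptions for the example, and giving tied values their average rank is one common convention rather than something stated in this report.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y = x**3 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)  # perfect monotonic relationship
r = pearson_r(x, y)       # strictly below 1 because the relationship is not linear
```

In this example the ranks of `x` and `y` are identical, so ρ reaches 1 while Pearson's r stays below 1.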
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
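The pair-counting definition above corresponds to the variant usually called tau-a, and can be sketched as follows. The function name `kendall_tau_a` and the sample data are assumptions for the example; library implementations often use a tie-adjusted variant (tau-b) instead, so values may differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie; it counts toward neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair out of six possible pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau_a(x, y)  # (5 - 1) / 6
```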
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for the phik package.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | Tax Year | Tax Article | Credit Type | Credit Name | Basis Type | Notes | Number of Taxpayers | Amount of Credit | Percent of Credit | Median Amount of Credit | Mean Amount of Credit | Group Sort Order | Credit Type Sort Order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2015 | 9A | Credit Earned | Alternative Fuels and Electric Vehicle Recharging Property Credit | Entire Net Income | d/ | NaN | NaN | NaN | NaN | NaN | 1 | 1 |
| 1 | 2015 | 9A | Credit Earned | Alternative Fuels and Electric Vehicle Recharging Property Credit | Fixed Dollar Minimum | NaN | 3.0 | 28478.0 | 71.43 | 10000.0 | 9493.0 | 2 | 1 |
| 2 | 2015 | 9A | Credit Earned | Alternative Fuels and Electric Vehicle Recharging Property Credit | Capital Base | d/ | NaN | NaN | NaN | NaN | NaN | 3 | 1 |
| 3 | 2015 | 9A | Credit Earned | Alternative Fuels and Electric Vehicle Recharging Property Credit | Total | NaN | 5.0 | 39866.0 | 100.00 | 10000.0 | 7973.0 | 999 | 1 |
| 4 | 2015 | 9A | Credit Claimed | Alternative Fuels and Electric Vehicle Recharging Property Credit | Entire Net Income | d/ | NaN | NaN | NaN | NaN | NaN | 1 | 5 |
| 5 | 2015 | 9A | Credit Claimed | Alternative Fuels and Electric Vehicle Recharging Property Credit | Fixed Dollar Minimum | NaN | 3.0 | 43478.0 | 79.24 | 10000.0 | 14493.0 | 2 | 5 |
| 6 | 2015 | 9A | Credit Claimed | Alternative Fuels and Electric Vehicle Recharging Property Credit | Capital Base | d/ | NaN | NaN | NaN | NaN | NaN | 3 | 5 |
| 7 | 2015 | 9A | Credit Claimed | Alternative Fuels and Electric Vehicle Recharging Property Credit | Total | NaN | 5.0 | 54866.0 | 100.00 | 10000.0 | 10973.0 | 999 | 5 |
| 8 | 2015 | 9A | Credit Used | Alternative Fuels and Electric Vehicle Recharging Property Credit | Entire Net Income | d/ | NaN | NaN | NaN | NaN | NaN | 1 | 6 |
| 9 | 2015 | 9A | Credit Used | Alternative Fuels and Electric Vehicle Recharging Property Credit | Fixed Dollar Minimum | NaN | 0.0 | 0.0 | 0.00 | 0.0 | 0.0 | 2 | 6 |
Last rows
| | Tax Year | Tax Article | Credit Type | Credit Name | Basis Type | Notes | Number of Taxpayers | Amount of Credit | Percent of Credit | Median Amount of Credit | Mean Amount of Credit | Group Sort Order | Credit Type Sort Order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9370 | 2016 | 9A | Credit Used | Workers with Disabilities Tax Credit | Capital Base | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3 | 6 |
| 9371 | 2016 | 9A | Credit Used | Workers with Disabilities Tax Credit | Total | d/ | NaN | NaN | NaN | NaN | NaN | 999 | 6 |
| 9372 | 2016 | 9A | Credit Refunded | Workers with Disabilities Tax Credit | Entire Net Income | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 7 |
| 9373 | 2016 | 9A | Credit Refunded | Workers with Disabilities Tax Credit | Fixed Dollar Minimum | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2 | 7 |
| 9374 | 2016 | 9A | Credit Refunded | Workers with Disabilities Tax Credit | Capital Base | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3 | 7 |
| 9375 | 2016 | 9A | Credit Refunded | Workers with Disabilities Tax Credit | Total | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 999 | 7 |
| 9376 | 2016 | 9A | Credit Carried Forward | Workers with Disabilities Tax Credit | Entire Net Income | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 8 |
| 9377 | 2016 | 9A | Credit Carried Forward | Workers with Disabilities Tax Credit | Fixed Dollar Minimum | d/ | NaN | NaN | NaN | NaN | NaN | 2 | 8 |
| 9378 | 2016 | 9A | Credit Carried Forward | Workers with Disabilities Tax Credit | Capital Base | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3 | 8 |
| 9379 | 2016 | 9A | Credit Carried Forward | Workers with Disabilities Tax Credit | Total | d/ | NaN | NaN | NaN | NaN | NaN | 999 | 8 |